A new technique for very fast start-up of adaptive transversal-filter 
equalizers used in high-speed synchronous data communications is pre- 
sented. A special training sequence whose period in symbols is equal to 
the number of equalizer taps is used initially to achieve an open eye 
pattern. Rapid convergence, even over highly distorted channels, is obtained 
because an ideal reference sequence is available at the receiver, but it is 
not necessary to synchronize the ideal reference with the received sequence. 
The special choice of the training sequence automatically provides the 
synchronized ideal reference needed for fast convergence, but the resulting 
equalizer coefficients may be cyclically displaced from their proper posi- 
tions. After the eye is opened by this process, the equalizer coefficients are 
rotated to their proper positions, and decision-directed equalization is used 
with either a longer training sequence or random data to achieve final tap 
settings. Adjustments during the training period can be made with a 
gradient-type algorithm or with stochastic adjustment techniques; an 
exact analysis is possible for both of these schemes. Cyclic equalization is 
shown to provide perfect equalization at evenly spaced points in the fre- 
quency domain. 

I. INTRODUCTION 

The effective data throughput in polling systems is, to a large degree, 
dependent on the start-up time of the data modems that are used in 
the network. Many of these systems operate at high speed and trans- 
mit data blocks of comparatively short duration. At 4800 b/s, a 1000-bit 
block is transmitted in about 200 ms, and to achieve a reasonable 
overall efficiency, the time needed to condition the modem for trans- 
mission (start-up) should be short in comparison to the time required 
to transmit an average block. This becomes increasingly difficult with 
higher modem speeds. Prior to the transmission of the actual data, 
timing and carrier information must be recovered very accurately, and 
the adaptive equalizer that is necessary to cope with the linear channel 
distortion at such high speeds must be trained. 

The time required to adjust the equalizer represents the bulk of the 
modem start-up time; it is thus important to study in detail the
problems associated with fast equalizer start-up. The most common
structure of such an equalizer consists of a transversal filter with a set of
controlled gain coefficients that are spaced at the symbol interval T,
and the start-up problem is to find an initial set of "reasonably good" 
values for these coefficients in a very short time. The purpose of this 
paper is to present a practical method for doing this. 

We first provide some background and discuss some factors that 
affect equalizer start-up. This leads to the principle of cyclic equaliza- 
tion that is discussed in Section III. Sections IV through VIII discuss 
the operation of the cyclic equalizer using the mean-square tap-adjust- 
ment algorithm where averaging is used to compute the adjustment 
signals. The optimum tap coefficients are discussed and shown to 
provide perfect equalization of the channel at equally spaced points 
in the frequency domain. The relationship is explained between the 
eigenvalues of the channel correlation matrix, which control the con- 
vergence of the adaptive algorithm, and the discrete Fourier transform 
of the received training signal. Selection of the training sequence and 
the starting values of the tap coefficients and the effects of noise are 
also discussed. Finally, in the remaining sections, we analyze a more
practical implementation of the cyclic equalizer that does not use
averaging in the tap adjustment algorithm (stochastic adjustment).
The analysis of this algorithm is, in general, very difficult but, in the
case of the cyclic equalizer, the time-varying difference equation that
describes the noiseless equalization process can be solved exactly, and
the conditions for this algorithm to be stable can be developed. Here
again, the stability of the algorithm is related to the discrete Fourier
transform (DFT) of the received signal. It is shown that the algorithm
converges if the DFT of the received signal has no zero elements, that
is, if the received signal spectrum has no nulls. This material, along
with a brief discussion of the asymptotic behavior of the algorithm, is
given in Sections IX through XII.

Within the paper, we also discuss various implementations, including
a method to further speed up the tap calculation using an accelerated
signal-processing technique. It will be seen that cyclic equalization
is very attractive and economical to implement. Actual convergence
behavior is illustrated with computer simulations that demonstrate the
fast start-up capabilities of the new method.

We will consider the pulse-amplitude modulated data system shown
in Fig. 1. Data symbols, $d_k$, are transmitted every T seconds through
a transmitter low-pass filter. This signal then passes through a
distorting channel that has been made baseband by the
modulation-demodulation process inherent in the modem, noise is added, and the
composite signal is sampled every T seconds after the receiver filter.
The sampled signal vector $\mathbf{x}_k$ is then equalized by a transversal filter
with coefficient vector $\mathbf{c}$ (see Fig. 2) to produce an output
$y_k = \mathbf{x}_k^T\mathbf{c}$ upon which the decision device operates to produce
estimates, $\hat{d}_k$, of the transmitted symbols. The receiver structure has
the form of the optimum linear receiver,$^{6}$ but, because the channel is
never precisely known and changes with time, the transversal equalizer
is made adaptive to optimize performance.

Fig. 2 — Nonrecursive transversal filter, $y = \sum_k x_k c_k = \mathbf{x}^T\mathbf{c}$.

Our concern in this paper is with the equalizer and ways to make it
adapt rapidly from some initial setting to its final setting. A large
number of papers, a partial list of which has been included in Refs.
1 to 51, have been written about equalizers, algorithms for adjusting
them, and the speed with which these algorithms converge. In
developing procedures for adapting the equalizer coefficients, some
appropriate performance measure must be defined that will discriminate
between good and bad coefficient vectors. Although our goal is to
minimize the probability of error, this criterion is too difficult to work
with directly. As a result, secondary performance measures such as the
peak distortion,$^{2}$

$$D = h_0^{-1} \sum_{k \neq 0} |h_k|, \tag{1}$$

or the mean-square error, 

$$e = E\{|y_k - d_k|^M\}, \quad M = 2, \tag{2}$$

are used. In (1), $h_k$ is the sampled system impulse response. The peak
distortion is related to the "eye opening,"$^{6}$ and for binary symbols
and noiseless transmission, $D < 1$ implies no decision errors. In (2),
$E\{\cdot\}$ is the expectation operation and $y_k - d_k$ is the remaining error
at the equalizer output. These performance measures ($M$ could be
greater than 2, if desired) can be shown to be convex functions of the
equalizer coefficients, thereby proving the existence of a global
minimum.
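
As a small numerical illustration of (1), the peak distortion of a sampled response can be computed directly. The sketch below is not from the paper: the function name and the sample values are hypothetical, the main sample is assumed to be the largest one, and its magnitude is used as the normalizer.

```python
def peak_distortion(h, main=None):
    """Peak distortion in the sense of (1): sum of |h_k| over all samples
    except the main one, divided by the magnitude of the main sample.

    `main` is the index of the main sample; by default the sample of
    largest magnitude is used (an assumption made for this sketch).
    """
    if main is None:
        main = max(range(len(h)), key=lambda k: abs(h[k]))
    isi = sum(abs(v) for k, v in enumerate(h) if k != main)
    return isi / abs(h[main])


# Hypothetical sampled responses, not taken from the paper.
mild = [0.1, 1.0, -0.2]
severe = [0.5, 1.0, -0.7, 0.3]

print(peak_distortion(mild))    # below 1: binary eye is open
print(peak_distortion(severe))  # above 1: the eye may be closed
```

For binary symbols without noise, the first response would allow error-free decisions, while the second would not be guaranteed to.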

We will work primarily with the mean-square error (MSE) criterion.
This criterion includes the effects of noise, whereas the peak distortion
criterion does not; it is convenient to work with mathematically; it
can be used to bound the probability of error;$^{52}$ and it leads to adaptive
algorithms that are easy to implement. Using the MSE, the optimum
coefficient vector for the equalizer can be determined easily. Assuming
$E\{d_k^2\} = 1$, we have from (2)

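$$e = \mathbf{c}^T A\mathbf{c} - 2\mathbf{c}^T\mathbf{v} + 1, \tag{3}$$

where

$$A = E\{\mathbf{x}_k\mathbf{x}_k^T\} \tag{4}$$

is the signal autocorrelation matrix,

$$\mathbf{v} = E\{d_k\mathbf{x}_k\} \tag{5}$$

defines the signal correlation vector, and $\mathbf{x}_k$ is the vector of tap signals
at the $k$th time instant. Finding the gradient of (3) with respect to the
tap gains gives

$$\mathbf{g} = 2E\{(y_k - d_k)\mathbf{x}_k\} = 2(A\mathbf{c} - \mathbf{v}). \tag{6}$$

Our optimization problem has a unique solution if $A^{-1}$ exists. Setting
(6) equal to zero yields

$$\mathbf{c}_{\mathrm{opt}} = A^{-1}\mathbf{v} \tag{7}$$

$$e_{\mathrm{opt}} = 1 - \mathbf{v}^T\mathbf{c}_{\mathrm{opt}} = 1 - \mathbf{v}^T A^{-1}\mathbf{v}. \tag{8}$$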

The problem of equalizer start-up is simply to find the solution to
(7) in a rapid and economical manner. The economical part of the
question is very important. One can imagine a start-up procedure that
operates by sending a special training signal for a short period of time.
The received signal, $x(t)$, is stored at the receiver. The training
sequence is known at the receiver, but its absolute time reference is not
known. The receiver contains a very fast high-power computer which
now, in essentially no time, computes (8) for a large number of different
time references and finds the time reference for the locally stored
training sequence that minimizes $e_{\mathrm{opt}}$. The computer has thus
accomplished both synchronization and equalization. This hypothetical
system achieves a start-up time limited only by the time required to 
transmit the training signal but, with today's technology, its speed-cost 
product, if you will, is very poor. It does not represent an economical 
solution to the problem. Many currently proposed fast start-up 
equalizers, although not as extreme as this example, still do not present 
cost-effective solutions. 

In addition to the economic aspect, this example illustrates two other 
important points. The first is the solution of (7). Much of the work on 
equalization is concerned with efficient algorithms that avoid the direct 
matrix inversion and obtain an iterative solution. Often, however, the 
time required to perform the calculations in (4) and (5) is not explicitly 
considered in evaluating start-up time. Second, synchronizing the 
stored reference signal in the receiver can take significant time, and 
that aspect of start-up time seems to be universally ignored. 

Now we consider the solution of (7) in more practical terms. A well- 
known approach for solving (7) is 

$$\mathbf{c}_{m+1} = \mathbf{c}_m - \beta_m(A\mathbf{c}_m - \mathbf{v}), \tag{9}$$

i.e., a first-order steepest-descent gradient algorithm. For appropriate
conditions on $\beta_m$, $\mathbf{c}_m$ converges to $\mathbf{c}_{\mathrm{opt}}$.
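
The iteration (9) can be tried on a toy problem. The following sketch is only an illustration under stated assumptions: the 2 x 2 matrix `A`, the vector `v`, the fixed step size, and the function names are hypothetical, and a constant step below 2 divided by the largest eigenvalue of A is used as one sufficient choice for convergence.

```python
def mat_vec(A, c):
    """Matrix-vector product for lists of lists."""
    return [sum(a * x for a, x in zip(row, c)) for row in A]


def steepest_descent(A, v, beta, steps):
    """Iterate c <- c - beta * (A c - v), starting from the zero vector."""
    c = [0.0] * len(v)
    for _ in range(steps):
        grad = [ac - vi for ac, vi in zip(mat_vec(A, c), v)]
        c = [ci - beta * gi for ci, gi in zip(c, grad)]
    return c


# Hypothetical autocorrelation matrix and correlation vector.
A = [[1.0, 0.5],
     [0.5, 1.0]]
v = [1.0, 0.2]

# The eigenvalues of A are 0.5 and 1.5, so beta = 0.5 is well inside 2/1.5.
c = steepest_descent(A, v, beta=0.5, steps=200)
print(c)  # approaches the direct solution of A c = v
```

For this example the direct solution of (7) is c = (1.2, -0.4), which the iteration approaches without any matrix inversion.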

According to (6), the gradient is obtained by correlating the
tap-signal vector and the error voltage

$$\mathbf{g} = 2E\{e_k\mathbf{x}_k\}. \tag{10}$$

From an implementation point of view, this is a convenient quantity
because the signal vector, $\mathbf{x}_k$, is readily available, and the error,
$e_k = y_k - d_k$, can be estimated. A difficulty still remains in that the
expected value is not available in real time and must be estimated by
averaging over a finite number of symbols. The difference equation
(9) then takes the form

$$\mathbf{c}_{m+1} = \mathbf{c}_m - \beta_m\frac{1}{L}\sum_{k=mL}^{mL+L-1}\mathbf{x}_k(\mathbf{x}_k^T\mathbf{c}_m - d_k) = \mathbf{c}_m - \beta_m(A_m\mathbf{c}_m - \mathbf{v}_m). \tag{11}$$

Averaging is done over $L$ symbols between succeeding adjustments.
If random data are transmitted, $A_m$ and $\mathbf{v}_m$ will depend on the
particular signal pattern of each iteration interval and are random
variables with means $A$ and $\mathbf{v}$ and variances decreasing with longer
averaging interval $L$. The analysis of the behavior of (11) is difficult,
particularly when we try to determine ways of improving the convergence rate.
By reducing $L$, we can make many more iterations in a given time, but
we must use a smaller $\beta$-value to take into account the larger variance
of the calculated gradient. Longer averaging between each step would
take more time but give a better estimate for the gradient, and
therefore allow a somewhat higher value of $\beta$. Monsen$^{13}$ has studied the
optimization of the averaging interval, assuming an ideal reference and
Gaussian signals. In this special case, the optimum value is $L = 1$; i.e.,
corrections are made after each symbol and no averaging at all is
done. This method is often called "stochastic approximation," because
the corrections are stochastic quantities whose means equal the desired
gradient.
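
The case L = 1 of (11), with a correction after every symbol, can be sketched as follows. This is a minimal illustration and not the simulation used in the paper: the two-sample channel, the step size, the number of taps, the random seed, and the function name `lms_train` are all assumptions, and an ideal (known) reference is used.

```python
import random


def lms_train(received, reference, n_taps, beta):
    """Update the taps after every symbol: c <- c - beta * x * (x^T c - d)."""
    c = [0.0] * n_taps
    line = [0.0] * n_taps                 # delay line, newest sample first
    for r, d in zip(received, reference):
        line = [r] + line[:-1]            # shift the new sample in
        y = sum(ci * xi for ci, xi in zip(c, line))
        err = y - d                       # error against the ideal reference
        c = [ci - beta * err * xi for ci, xi in zip(c, line)]
    return c


rng = random.Random(1)                    # fixed seed for repeatability
data = [rng.choice((-1.0, 1.0)) for _ in range(2000)]

# Hypothetical noiseless channel: r_k = d_k + 0.4 d_{k-1}
received = [d + 0.4 * dp for d, dp in zip(data, [0.0] + data[:-1])]

taps = lms_train(received, data, n_taps=5, beta=0.05)
print([round(t, 3) for t in taps])
```

For this mild channel the taps settle close to the truncated inverse of the channel, that is, near 1, -0.4, 0.16, and so on.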

At this point, it appears that the mean-square algorithm with no
averaging, i.e.,

$$\mathbf{c}_{m+1} = \mathbf{c}_m - \beta_m\mathbf{x}_m(\mathbf{x}_m^T\mathbf{c}_m - d_m) = (I - \beta_m\mathbf{x}_m\mathbf{x}_m^T)\mathbf{c}_m + \beta_m d_m\mathbf{x}_m, \tag{12}$$

is an attractive scheme to investigate further to obtain fast real-time
convergence. There remain, for the moment, two main difficulties that
need further discussion. The first one is the problem of obtaining the
data values $d_k$. They can be estimated in the usual way from a threshold
decision, but on channels with large distortion the initial error rate
may be close to 50 percent and the estimates $\hat{d}_k$ are very unreliable in
such a situation. An algorithm with a decision-directed reference may thus
behave erratically, and convergence cannot be guaranteed. The results
of a few simulations will give some further insight.

The channel assumed for the simulation consisted of a 10-percent
cosine roll-off baseband filter with parabolic delay distortion [5.4T
at the Nyquist frequency (1/2T)] and a sampling offset of 0.3T from
the peak of the response. The resulting channel response from a single
pulse is shown in Fig. 3. This same pulse was also used by Hirsch and
Wolf.$^{11}$ The initial peak distortion is 2.62, resulting in a completely
closed eye pattern.

The first simulation is for the algorithm (12), but an estimated
reference (obtained from a threshold decision) was used. Figure 4
shows the resulting peak distortion versus the number of symbols for
four different step sizes. Decision errors are responsible for random
distortion increases rather than reductions. This is avoided in Fig. 5,
where we have repeated the same runs with an ideal reference signal.
The improvement is significant. Note that the ideal reference signal
is really needed only until the peak distortion has decreased sufficiently
to yield an open eye pattern; from this time on, the error probability
is essentially zero, and a decision-directed reference can be used.

Fig. 3 — Impulse response with peak distortion of 2.62 (pulse response of a
10% cosine roll-off baseband filter with parabolic delay distortion; timing
offset from peak response: 30%).

The difficulty in providing an ideal reference signal lies in the syn- 
chronization problem. Remember that we require such a signal only 
in channels with very large amounts of distortion, but achieving reliable 
synchronization in the presence of severe distortion is a problem in 
itself that usually calls for time-consuming correlation methods.$^{6}$

A second problem is associated with the choice of the training 
sequence. Obviously, a strictly random data pattern would be a bad 
choice, since transitions would only occur on a probabilistic basis and 
not be guaranteed. The variability of repeated convergence runs would 
be large. This can be avoided by transmitting a short-period training 
sequence. Even if the starting point occurred at random, convergence 
would be more predictable. We know that we cannot make the period 
of the training sequence shorter than the duration of the impulse 
response of the equalizer; otherwise, the tap signals would not be 
linearly independent, the correlation matrix A would be singular, and 
a unique solution for the optimum tap vector c would not exist. We 
will, however, study the limiting case where the period, in symbols, of 
the training sequence is equal to the number of taps on the equalizer. 
This will lead directly to the idea of cyclic equalization. Before we
do this, we will provide some additional insight by a short discussion of
the frequency domain aspects of the equalization problem.

Fig. 4 — Convergence behavior of stochastic adjustment algorithm (12) with a
decision-directed reference.

III. PRINCIPLE AND IMPLEMENTATION OF CYCLIC EQUALIZATION 

Let the spectrum of the received data signal be $G(\omega)$ and assume
that this signal is applied to an $N$-tap equalizer with coefficients $c_n$,
$n = 0, \ldots, N - 1$. The resulting output spectrum is

$$X(\omega) = G(\omega)\sum_{n=0}^{N-1} c_n\exp(-j\omega nT), \tag{13}$$

and the overall system would be distortion-free (Nyquist criterion) if

$$\sum_k X\left(\omega + \frac{2\pi k}{T}\right) = \exp(-j\omega\tau), \quad |\omega| \le \frac{\pi}{T}. \tag{14}$$

Combination of (13) and (14) yields the condition

$$\sum_{n=0}^{N-1} c_n\exp(-j\omega nT)\sum_k G\left(\omega + \frac{2\pi k}{T}\right) = \exp(-j\omega\tau), \quad |\omega| \le \frac{\pi}{T}. \tag{15}$$

Fig. 5 — Convergence behavior of stochastic adjustment algorithm (12) with an
ideal reference.


Obviously, (15) cannot be satisfied for a finite $N$ and an arbitrary
$G(\omega)$. Usually, the coefficients $c_n$ are chosen according to a minimum
mean-square-error (MMSE) criterion in the time domain, which is
equivalent to an MMSE criterion of (15) in the frequency domain. The
problems of such an approach have been discussed in Section II, and
we have seen that the commonly used iterative search schemes can be
very efficient during the tracking mode, but initial training may not be
without problems.

A closer look at (15) shows that the left-hand side is a linear
combination of the coefficients $c_n$. Although perfect equalization cannot
be achieved at all frequencies, it is possible to obtain zero error at a
number of specified frequencies $\omega_m$ within the range $|\omega_m| < \pi/T$. This
is, of course, also true with an MMSE approach, since the resulting
transfer function will oscillate around the desired one; i.e., the error
will ripple between positive and negative values. The crossing
frequencies are, however, not known and usually not of interest. In the
new scheme we propose here, we will do exactly the contrary: we will
precisely specify the crossing frequencies, although we realize that such
an approach will, in general, not yield MMSE. Specifying the frequencies
$\omega_m$ where perfect equalization is obtained will transform the condition
(15) into a set of linear equations for the coefficients $c_n$. Obviously, we
have to consider two cases:

(i) $N$ even: $N/2$ different frequencies $\omega_m \neq 0$ must be specified.
(ii) $N$ odd: $(N - 1)/2$ different frequencies $\omega_m \neq 0$ and $\omega = 0$
must be specified to obtain a unique solution for the $c_n$'s.

Theoretically, a set of reference tones $\omega_m$ could be transmitted, $G(\omega_m)$
measured at the receiver, and the coefficients computed from (15).
Fortunately, it is possible to propose a much more attractive solution.

The generation of the reference tones can be accomplished in a
straightforward way if we select the frequencies $\omega_m$ equally spaced
across the Nyquist band; a suitable periodic data sequence of length
$NT$ will produce such spectral lines at $\omega_m = 2\pi m/NT$. Note that the
number of symbols in such a training sequence is equal to the number
of taps of the equalizer. This choice is extremely important and provides
a number of unique advantages to achieve fast equalizer start-up.

We now discuss in detail such a training procedure. Assume an 
equalizer where an ideal reference signal is used and the period of the 
training sequence is equal to the number of taps on the equalizer. 
Assume for the moment that the channel is distortionless and the ideal 
reference is synchronized with the incoming signal. If we let the adap- 
tive algorithm adjust the equalizer taps, the center tap on the equalizer 
will become unity, and all the others will be zero. This is really what 
we mean when we say the reference is synchronized ; that is, the opti- 
mum equalizer coefficients are centered on the equalizer rather than 
shifted off to one end or the other. Now again, for this "ideal" example, 
if the reference signal is delayed by one symbol from perfect syn- 
chronization, the adaptive algorithm will cause the equalizer coefficient 
one position removed from the center to become unity, and all the 
others will be zero. The movement of the unity gain tap by one position 
indicates a one-symbol delay in synchronization of the ideal reference. 
In an actual situation, the other taps on the equalizer will be nonzero 
and, with an unsynchronized reference, the adaptive algorithm will 
cause tap coefficients to occur that are cyclically rotated from those 
that would occur if the reference were synchronized. 

To say this another way, if the training sequence is periodic with a 
period equal in symbols to the number of taps of the equalizer, the 
received signal is then also periodic (neglecting noise effects), and one 
full period of the sequence is always stored in the equalizer. Each 
symbol that is shifted out at the end of the delay line is replaced by 
an identical new symbol at the input. This is more clearly shown in
Fig. 6 for a seven-tap equalizer with taps $c_1$ through $c_7$ and a seven-bit
sequence $x_1$ through $x_7$. At time $t_0 + 2T$, it is seen that the stored
sequence has been cyclically shifted by two units as compared to the
time $t_0$. But it is also seen that the same output signal $y(t_0 + 2T)$
could have been obtained at time $t = t_0$ if the taps were cyclically
shifted back by two positions. Thus, at any given time $t = t_0$, all
outputs $y(t = t_0 + kT)$ can be obtained with a suitable cyclic shift of the
components of the tap vector.
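
The equivalence between shifting the stored sequence and shifting the taps can be checked numerically. The sketch below uses a hypothetical seven-symbol period and hypothetical tap values; the helper names `rotate` and `output` are illustrative only.

```python
def rotate(seq, k):
    """Cyclically rotate seq to the right by k positions (left if k < 0)."""
    k %= len(seq)
    return list(seq[-k:]) + list(seq[:-k])


def output(taps, line):
    """Equalizer output: inner product of the taps and the stored signal."""
    return sum(c * x for c, x in zip(taps, line))


# Hypothetical stored period at time t0 and hypothetical tap values.
line_t0 = [1, -1, -1, 1, 1, 1, -1]
taps = [0.1, -0.2, 0.9, 0.3, -0.1, 0.05, 0.0]

# Two symbols later the periodic contents have rotated by two positions.
line_t2 = rotate(line_t0, 2)
y_later = output(taps, line_t2)

# The same value is obtained at t0 with the taps rotated back by two.
y_now = output(rotate(taps, -2), line_t0)
print(y_later, y_now)
```

The two printed values agree, which is the property that lets an unsynchronized reference produce a cyclically displaced tap vector.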




Fig. 6 — Basic idea of cyclic equalization (diagram labels: $x(t_0 + 2T)$,
$y(t_0 + 2T)$).

This feature provides an elegant solution to the synchronization 
problem. Any cyclic shift between the received sequence and the refer- 
ence sequence will yield a compensating cyclic shift of the (same) tap 
coefficients. It is, therefore, not necessary to achieve synchronization 
prior to equalization, but it is of course necessary to properly shift the 
tap coefficients after initial training to prepare the equalizer for random 
data. This can easily be done by cycling them in such a way that the 
largest coefficient is aligned with a reference position, e.g., the center 
tap. Because of its particular features just described, we will call this
novel start-up scheme "cyclic equalization."$^{42,43}$
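
The alignment step described above, cycling the taps until the largest coefficient sits at the reference position, might be sketched as follows. The function name is hypothetical, the center tap is taken as the reference position, and "largest" is interpreted as largest in magnitude, which is an assumption.

```python
def align_to_center(taps):
    """Cyclically shift the taps so the largest-magnitude one is centered."""
    n = len(taps)
    peak = max(range(n), key=lambda i: abs(taps[i]))
    shift = n // 2 - peak                 # positions to rotate to the right
    return [taps[(i - shift) % n] for i in range(n)]


# Hypothetical taps after cyclic training with an unsynchronized reference:
# the main tap has ended up at position 0 instead of the center.
trained = [0.9, 0.3, -0.1, 0.05, 0.0, 0.1, -0.2]
aligned = align_to_center(trained)
print(aligned)
```

Only the positions of the coefficients change; their values and cyclic order are preserved.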

The possible structure of such an equalizer is outlined in Fig. 7. 
An internal word generator produces an ideal reference sequence that 
need not be synchronized with the received sequence. All taps are 
initially preset to identical values (since the location of the "center 
tap" is not known). The equalizer will then produce a set of taps with 
the particular cyclic shift corresponding to the "synchronization 
delay." After this initial training, the tap coefficients are cyclically 
shifted for "alignment," as outlined above. At this point, the equalizer 
has reasonably good tap coefficient settings and the peak distortion at
the output is less than unity; i.e., the eye is open and, in the absence
of noise, errorless decisions can be made. Fast coarse adjustment of
the tap coefficients has been achieved without wasting time
synchronizing the ideal reference. Once the eye is open, decision-directed
equalization can be used with a somewhat longer training sequence
or random data to achieve the final fine adjustment of the tap
coefficients.

Fig. 7 — Block diagram of an equalizer with cyclic start-up.

Numerous simulations have demonstrated that mean-square equalization
with a training sequence period equal to the length of the equalizer
gives very fast equalization that is also very consistent with respect
to the starting point of the adaptation. One of these simulations is
illustrated in Fig. 8. The same channel is used for this example as was
used previously; the peak distortion is 2.62, the signal-to-noise ratio
is 30 dB, and the step size is 0.04. In this case, the equalizer has 15
taps and a 15-bit maximum-length training sequence is used because
of its nice spectral properties. Adjustments are made at the symbol
rate. The shaded region in the figure contains all 15 possible
convergence curves that correspond to the different starting points for
adaptation. Not only are all the convergence curves very similar, but
they all achieve a peak distortion of about 0.4 or less in 15 symbols.

Fig. 8 — Start-up behavior with cyclic equalization (15 taps, 15 bits;
initial peak distortion = 2.62; S/N = 30 dB, $\beta$ = 0.04, $s$ = 0.05;
horizontal axis: symbols).
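
The spectral properties of a 15-bit maximum-length sequence can be checked with a short computation. This is a sketch under assumptions: the paper does not state which generator was used, so the recurrence below (corresponding to one primitive polynomial of degree 4), the initial state, and the mapping of bits to +1/-1 are chosen only for illustration.

```python
import cmath


def m_sequence_15():
    """One period of a length-15 maximum-length sequence as +1/-1 symbols.

    Uses the recurrence a[n+4] = a[n] XOR a[n+1]; this choice of generator
    is an assumption, not taken from the paper.
    """
    state = [1, 0, 0, 0]
    bits = []
    for _ in range(15):
        bits.append(state[0])
        state = state[1:] + [state[0] ^ state[1]]
    return [1.0 if b else -1.0 for b in bits]


def dft(x):
    """Plain (unnormalized) discrete Fourier transform."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]


seq = m_sequence_15()
mags = [abs(v) for v in dft(seq)]
print([round(m, 6) for m in mags])
```

Apart from the zero-frequency line, all 14 spectral lines have the same magnitude, so the training signal probes the channel evenly at the equally spaced frequencies.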

A few words are in order about the presetting of the tap coefficients. 
Because an unsynchronized reference is used, the location of the 
largest coefficient is a priori unknown. It is therefore reasonable to 
preset all coefficients to identical initial values s, as we have already 
mentioned above. With most channels, tap coefficients of both
polarities will evolve so that one might consider setting $s = 0$ for an
unknown channel. The large final value of the center tap would, however, suggest
that slightly biased initial conditions might give faster convergence; 
we will make more precise statements about that in Section VII. 

The discussed method of presetting has, of course, some consequences 
if a channel with low distortion or even an ideal channel were used. In 
such a situation, a conventional equalizer could do a better job because 
it would be started with the optimum tap settings ($c_k = \delta_{0k}$) right
away and need not make any corrections at all. The cyclic equalizer 
would have to "converge" even with an ideal input signal; simulations 
of this case have shown a convergence plot similar to that of Fig. 8. 

As a final example, we present the results of a VSB system that is
operated over a channel with "parabolic-like" delay (exponent = 2.73)
and an S/N of 30 dB. The received and demodulated signal is sampled
with different timing phases spaced T/4 apart and equalized in a cyclic
equalizer with $N = 15$ taps. The distortion values resulting after
equalization during only one sequence (i.e., 15 symbols) are
summarized in Table I. For comparison, the initial channel distortion
$D_{\mathrm{channel}}$ and the minimum distortion $D_{\min}$ that can be achieved with
an equalizer of this length are also included. It can be seen that initial
training using cyclic equalization achieves a performance that is
already close to optimum.

Table I - Distortion for VSB channel and 15-tap equalizer

Timing    D_channel    D_cycl.    D_min
0%        2.04         0.15       0.06
25%       1.87         0.52       0.21
50%       2.25         0.99       0.98
75%       2.88         0.25       0.12

Some comments should be made about the simulation results we
have presented. They indicate that initial training with cyclic
equalization may only be necessary for a very short time; in some cases, for
only one sequence period. This means that the received signal is not
really periodic and that no spectral lines in the strict sense will occur at
equally spaced frequencies $\omega_m$, as we specified earlier in this section.
The spectrum will be continuous, showing increasingly concentrated
peaks at those frequencies with larger numbers of sequence repetitions.
We have not found this to be a disadvantage; in fact, under some
circumstances, the tap settings achieved with only a small number of
iterations were, for the transmission of random data, preferable to the
steady-state solution.

We have shown by example that fast, reliable initial convergence
can be achieved using an ideal reference signal without spending any 
time to synchronize the reference. Final fine adjustment of the taps is 
accomplished in a decision-directed mode using a longer sequence or 
random data. In the next sections, we will analyze the behavior of the 
cyclic equalizer during its initial training period. The convergence be- 
havior with the mean-square algorithm with averaging, the choice of 
the training sequence, and the effect of the initial value of the taps will 
be considered. Then the exact behavior of the mean-square algorithm 
without averaging will be analyzed and conditions for convergence 
will be given. 

IV. STEADY-STATE SOLUTION FOR THE TAPS 

As was discussed in the previous section, the operation of the cyclic 
equalizer does not depend upon the synchronization of the reference, 
and we will not stress the rotation property of the taps unless necessary. 

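We assume a system with $N$ equalizer taps and let the samples of the
received signal be the components of the vector

$$\mathbf{x}^T = (\gamma_N, \gamma_{N-1}, \ldots, \gamma_1). \tag{16}$$

If we neglect the noise components, the tap-signal vector is periodic
and successive vectors are cyclic shifts of each other ($\gamma_{N+m} = \gamma_m$). We
define a signal matrix

$$S = \begin{pmatrix}
\gamma_N & \gamma_{N-1} & \gamma_{N-2} & \cdots & \gamma_1 \\
\gamma_1 & \gamma_N & \gamma_{N-1} & \cdots & \gamma_2 \\
\gamma_2 & \gamma_1 & \gamma_N & \cdots & \gamma_3 \\
\vdots & \vdots & \vdots & & \vdots \\
\gamma_{N-1} & \gamma_{N-2} & \gamma_{N-3} & \cdots & \gamma_N
\end{pmatrix} \tag{17}$$

whose rows consist of all $N$ succeeding sample vectors. The elements of
$S$ are given by

$$S_{ik} = \gamma_{(i-k) \bmod N}. \tag{18}$$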

At the equalizer output, a sequence of values $\mathbf{x}^T\mathbf{c}$ ($\mathbf{c}$ is the tap-weight
vector) appears as the input vector $\mathbf{x}$ is cyclically shifted through its
$N$ states. The resulting output sequence is

$$\mathbf{y} = S\mathbf{c}. \tag{19}$$

Obviously, it is possible to obtain from a given input sequence any
arbitrary desired output sequence by a suitable choice of $\mathbf{c}$, provided
only that $S^{-1}$ exists. If we define a data vector $\boldsymbol{\xi}$ which contains the
reference values associated with $\mathbf{y}$, it is possible to select $\mathbf{c}$ so that

$$\mathbf{y} = \boldsymbol{\xi} = S\mathbf{c}, \tag{20}$$

i.e., the recovered sequence can be perfectly equalized (at least, at the
sample points) and there is no residual error. This is even true with
nonlinear distortion. Since the error can be reduced to zero, we
conclude that the same tap vector

$$\mathbf{c}_0 = S^{-1}\boldsymbol{\xi} \tag{21}$$

is obtained with any equalizer in the steady state, regardless of the
particular tap-updating algorithm (as long as it is unbiased).
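
As an illustration of (17) through (21), the signal matrix can be built from one received period and the steady-state taps obtained by solving the linear system directly. The sketch below is not the adaptive procedure of the paper: the three received samples and reference values are hypothetical, and a generic Gauss-Jordan elimination stands in for whatever tap-updating algorithm is used in practice.

```python
def signal_matrix(gamma):
    """Build S of (17) from one received period [gamma_1, ..., gamma_N]."""
    n = len(gamma)

    def g(m):
        # gamma_m with a periodic index, so that gamma_0 equals gamma_N
        return gamma[(m - 1) % n]

    return [[g(i - k) for k in range(n)] for i in range(n)]


def solve(a, b):
    """Solve a c = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


gamma = [0.3, 1.0, -0.2]      # hypothetical samples gamma_1 .. gamma_3
xi = [1.0, -1.0, -1.0]        # hypothetical reference values
S = signal_matrix(gamma)
c0 = solve(S, xi)             # steady-state taps in the sense of (21)
y = [sum(s * c for s, c in zip(row, c0)) for row in S]
print(c0, y)
```

The computed output sequence reproduces the reference values, i.e., there is no residual error at the sample points.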

We now proceed to determine the eigenvalues of the circulant matrix
$S$. Let us first define a set of values $r$ so that

$$r^N = 1; \quad r_k = \exp\left(j\frac{2\pi k}{N}\right). \tag{22}$$

In the next step we form

$$\begin{aligned}
\lambda &= \gamma_N + r\gamma_{N-1} + r^2\gamma_{N-2} + \cdots + r^{N-1}\gamma_1 \\
r\lambda &= \gamma_1 + r\gamma_N + r^2\gamma_{N-1} + \cdots + r^{N-1}\gamma_2 \\
&\;\;\vdots \\
r^{N-1}\lambda &= \gamma_{N-1} + r\gamma_{N-2} + r^2\gamma_{N-3} + \cdots + r^{N-1}\gamma_N.
\end{aligned}$$

This may be written in matrix form as

$$\lambda_k\mathbf{r}_k = S\mathbf{r}_k, \tag{23}$$

where we have defined the vector

$$\mathbf{r}_k = \{r_{kn}\}; \quad \text{with } r_{kn} = \frac{1}{\sqrt{N}}\exp\left(j\frac{2\pi}{N}nk\right). \tag{24}$$

The $\mathbf{r}_k$'s are obviously eigenvectors of $S$. The eigenvalues $\lambda_k$ are

$$\lambda_k = \mathbf{x}^T\mathbf{r}_k, \quad 0 \le k \le N - 1, \tag{25}$$

and are given by the discrete Fourier transform (DFT) of the input
vector $\mathbf{x}$. The signal matrix $S$ can be diagonalized if we introduce a
matrix $W$ with

$$(W)_{ik} = \frac{1}{\sqrt{N}}\exp\left(j\frac{2\pi}{N}ik\right), \tag{26}$$

whose columns are made up from the vectors $\mathbf{r}_k$. $W$ is also symmetric
and unitary; the properties

$$W = W^T, \quad W^* = W^{-1}, \quad W^*W = I \tag{27}$$

are easily established. We may now alternatively either express the
eigenvalues as components of the diagonal matrix $D$

$$D = W^*SW \tag{28}$$

or as components of a vector

$$\boldsymbol{\lambda} = W\mathbf{x}, \tag{29}$$

since multiplication with $W$ transforms a vector into its DFT.
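
The eigenvector property of the circulant matrix can be verified numerically. In the sketch below the received samples are hypothetical and the function name is illustrative; the eigenvalue is computed as the unnormalized sum formed before (23), which differs from $\mathbf{x}^T\mathbf{r}_k$ with the normalized $\mathbf{r}_k$ of (24) only by the constant factor $\sqrt{N}$.

```python
import cmath


def circulant_eigen_residual(gamma):
    """Largest entry of |S r_k - lambda_k r_k| over all k, for S as in (17)."""
    n = len(gamma)

    def g(m):
        # gamma_m with a periodic index, so that gamma_0 equals gamma_N
        return gamma[(m - 1) % n]

    S = [[g(i - k) for k in range(n)] for i in range(n)]
    worst = 0.0
    for k in range(n):
        r = cmath.exp(2j * cmath.pi * k / n)
        vec = [r ** m / n ** 0.5 for m in range(n)]       # r_k as in (24)
        # lambda = gamma_N + r gamma_{N-1} + ... + r^(N-1) gamma_1
        lam = sum(r ** m * g(n - m) for m in range(n))
        Sv = [sum(S[i][m] * vec[m] for m in range(n)) for i in range(n)]
        worst = max(worst, max(abs(a - lam * b) for a, b in zip(Sv, vec)))
    return worst


# Hypothetical received period gamma_1 .. gamma_5.
residual = circulant_eigen_residual([0.3, 1.0, -0.2, 0.1, 0.05])
print(residual)   # numerically zero
```

A residual at the level of rounding error confirms that each DFT vector is an eigenvector and that the eigenvalues are the DFT values of the received period.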

We now give an interpretation in the frequency domain. The received
(periodic) sequence can be expanded into a Fourier series

$$x(t) = \frac{1}{\sqrt{N}}\sum_m X_m\exp\left(j\frac{2\pi}{NT}mt\right), \tag{30}$$

where the coefficients $X_m$ correspond to the spectral lines and the range
of $m$ is determined by the bandwidth. The components $\gamma_i$ in $\mathbf{x}$ are
given by $x(t = \tau + iT)$; this may be combined with (25), and we
obtain for the eigenvalues the frequency domain representation

$$\lambda_k = \sum_l X_{k+lN}\exp\left[j\frac{2\pi}{T}\tau\left(l + \frac{k}{N}\right)\right]. \tag{31}$$

In the case where all spectral lines are contained within twice the
Nyquist frequency and $\tau = 0$, we have

$$\begin{aligned}
\lambda_0 &= X_{-N} + X_0 + X_N \\
\lambda_1 &= X_1 + X_{1-N} \\
\lambda_2 &= X_2 + X_{2-N} \\
&\;\;\vdots \\
\lambda_{N-2} &= X_{N-2} + X_{-2} \\
\lambda_{N-1} &= X_{N-1} + X_{-1}.
\end{aligned} \tag{32}$$

As this represents 100-percent excess bandwidth, we may assume that
most practical systems are within this range. If spectral lines are only
within the Nyquist limit, (32) is simplified to

$$\lambda_k = X_k \ \text{ if } k < \frac{N}{2}; \qquad \lambda_k = X_{k-N} \ \text{ if } k > \frac{N}{2}. \tag{33}$$
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We can now give some comments as to the nature of the resulting 
tap vector c . By combining (21) and (28), c may be expressed as 

Co = WW-Wl (34) 

Here the term $W^* \xi$ is the dft of the ideal samples and establishes a set
of reference values at equally spaced points in the frequency domain
(discrete Nyquist equivalence). The multiplication with $D^{-1}$ determines
the gain of an ideal correction function at these points. The resulting
tap vector $c_0$ is the inverse dft of this correction function. The overall
transfer characteristic (channel and equalizer) is discrete Nyquist
equivalent when $c_0 = S^{-1} \xi$, i.e., frequency-domain equalization is
precise at a set of equidistant points (spacing $2\pi/NT$). This tap
vector is, in the general case, not optimum for random data transmission
after the training period. Basically, equalization is a mathematical
approximation problem. The equalizer approximates the compensation
function with a trigonometric polynomial. With a cyclic
equalizer, the coefficients are selected to match the desired function
at equidistant points. This will generally not give minimum mean-square
error at the output, since only discrete frequency information is
used and the channel behavior between the sample points is not taken
into account. In a recent paper, Chang and Ho$^{40}$ briefly discussed this
problem from a somewhat different point of view and concluded that
the initial approximation $c_0$ is generally close to the optimum settings
for random data. We will not further discuss the approximation problem
in this paper.
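The frequency-domain reading of the tap vector can be illustrated with a short sketch. It assumes the equalizer output over one period is the cyclic convolution of the taps with the received samples, and it uses made-up values for x and $\xi$; the helper names are hypothetical. The taps are obtained by dividing the DFT of the reference by the DFT of the received sequence and transforming back.

```python
import cmath

def dft(x):
    n = len(x)
    return [sum(x[p] * cmath.exp(-2j * cmath.pi * k * p / n) for p in range(n))
            for k in range(n)]

def idft(spec):
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * cmath.pi * k * p / n) for k in range(n)) / n
            for p in range(n)]

def cyclic_output(x, c):
    # Assumed model: y[i] = sum_n c[n] * x[(i - n) mod N] over one period.
    n = len(x)
    return [sum(c[k] * x[(i - k) % n] for k in range(n)) for i in range(n)]

x = [0.1, -0.2, 1.0, 0.3, -0.1]    # received period (made up, all DFT lines nonzero)
xi = [1.0, -1.0, 1.0, 1.0, -1.0]   # reference period (made up)

# Correction function sampled at the N line frequencies, then inverse DFT.
correction = [ref / lam for ref, lam in zip(dft(xi), dft(x))]
c0 = [v.real for v in idft(correction)]

max_error = max(abs(y - d) for y, d in zip(cyclic_output(x, c0), xi))
print(c0, max_error)
```

For real-valued x and $\xi$ the imaginary parts of the inverse transform are rounding noise, which is why only the real parts are kept.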

V. MEAN-SQUARE ALGORITHM WITH AVERAGING 

In this section, we look at a tap-control system that minimizes
the mean-square error between the equalizer output y(nT) and reference
symbols $\xi_n$. We use a steepest descent gradient algorithm of the
form

$$ c_{m+1} = c_m - \beta (A c_m - v), \tag{35} $$

where A is the signal-correlation matrix and v is the signal-correlation
vector. The gradient

$$ g_m = E\{ x_i (y_i - \xi_i) \} \big|_{c = c_m} \tag{36} $$

is evaluated in the usual way by time averaging.

First we note that, in the noiseless case, because of the cyclic nature
of x, both A and v can be determined by time-averaging over one full
sequence length of N symbols. Further, A and v are constant and well-defined
throughout the process. It is easily verified that A can be
expressed in terms of the normal signal matrix S,

$$ A = \frac{1}{N} S S^\dagger = \frac{1}{N} S^\dagger S, \tag{37} $$

and that v is equivalent to

$$ v = \frac{1}{N} S^\dagger \xi. \tag{38} $$

The gradient is zero, and updating stops when

$$ c = c_0 = A^{-1} v = S^{-1} \xi. \tag{39} $$

If we introduce the tap error vector $\delta_m = c_m - c_0$, (35) takes the form

$$ \delta_{m+1} = (I - \beta A)^m \delta_1. \tag{40} $$

The choice of $\beta$ and the convergence depend on the eigenvalues of A.
To guarantee that $\delta_{m+1} \to 0$ for large m, we require $0 < \beta < 2/\mu_{\max}$,
where $\mu$ are the eigenvalues of A. Since

$$ S = W D W^* \tag{41} $$

and therefore

$$ A = \frac{1}{N} S S^\dagger = \frac{1}{N} W D D^\dagger W^*, \tag{42} $$

the eigenvectors of A and S are common (and independent of x). The
eigenvalues $\mu_k$ of A are related to the eigenvalues $\lambda_k$ of S by

$$ \mu_k = \frac{1}{N} \lambda_k \lambda_k^*. \tag{43} $$

Another interpretation is obtained by realizing that the matrix A is
circulant (like S) and symmetric with elements

$$ [A]_{ik} = a_{i-k} = a_{k-i} = a_n = \frac{1}{N} \sum_{m=l}^{l+N-1} x_m x_{m-n}. \tag{44} $$

By analogy to (25), the eigenvalues are

$$ \mu_k = a^T \mathbf{r}_k, \qquad 0 \le k \le N-1, \tag{45} $$

where a contains the components $a_n$. We see that the eigenvalues are
given by the dft of the cyclic autocorrelation values, $a_n$.
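The two routes to the eigenvalues of A can be compared numerically. This is a small sketch under stated assumptions: the DFT is taken as the unnormalized sum (the scaling is a choice of this sketch), the received samples are made up, and the function names are hypothetical.

```python
import cmath

def dft(x):
    n = len(x)
    return [sum(x[p] * cmath.exp(-2j * cmath.pi * k * p / n) for p in range(n))
            for k in range(n)]

def cyclic_autocorrelation(x):
    # a[n] = (1/N) * sum_m x[m] * x[(m - n) mod N], following (44).
    n = len(x)
    return [sum(x[m] * x[(m - k) % n] for m in range(n)) / n for k in range(n)]

x = [0.1, -0.2, 1.0, 0.3, -0.1]   # one period of received samples (made up)
N = len(x)

mu_from_spectrum = [abs(lam) ** 2 / N for lam in dft(x)]                 # as in (43)
mu_from_autocorr = [v.real for v in dft(cyclic_autocorrelation(x))]      # as in (45)

gap = max(abs(a - b) for a, b in zip(mu_from_spectrum, mu_from_autocorr))
print(mu_from_spectrum, gap)
```

Both lists agree to rounding error, and every value is positive because no DFT line of this particular x vanishes.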

We now express these eigenvalues by the spectral lines in the frequency
domain. If we combine (31) and (43), we obtain

$$ \mu_k = \frac{1}{N} \sum_m \sum_n X_{k+mN} X_{k+nN}^* \exp\left[ j \frac{2\pi\tau}{T} (m - n) \right] \tag{46} $$

or, in an equivalent form,

$$ \mu_k = \frac{1}{N} \sum_m \sum_n X_{k-mN} X_{(n+m)N-k} \exp\left( j \frac{2\pi n \tau}{T} \right). \tag{47} $$

If the bandwidth is Nyquist-limited, only one single term in (47) contributes
to the eigenvalues, namely,

$$ \mu_k = \mu_{N-k} = \frac{1}{N} |X_k|^2, \qquad k < \frac{N}{2}. \tag{48} $$

The eigenvalues are then independent of timing and carrier parameters
and phase distortion of the channel. Only the signaling format, the
channel attenuation, and the choice of the sequence $\xi$ determine the
eigenvalues. (Note that we have so far not restricted the choice of $\xi$
to a particular class of sequences, such as maximum length sequences.)
In the case of excess bandwidth which is, however, limited to twice
the Nyquist frequency (all reasonable pulse-amplitude modulation
systems fall in this category), a few more terms in (47) need be considered,
and

$$ \mu_k = \frac{1}{N} \left[ |X_k|^2 + |X_{N-k}|^2 + X_k X_{N-k} \exp\left( j \frac{2\pi\tau}{T} \right) + X_k^* X_{N-k}^* \exp\left( -j \frac{2\pi\tau}{T} \right) \right], \tag{49} $$

which shows the influence of the timing phase $\tau$. Note that only the
third and fourth terms depend on phase and timing parameters. These
terms represent the fold-over around the Nyquist frequency.* The
smaller the roll-off, the less the eigenvalues will be affected by this
fold-over. In fact, it is even possible to have a small amount of excess
bandwidth without any contribution of these terms. This is shown in
Fig. 9. We distinguish two cases.

(i) N = odd. The Nyquist frequency is located midway between
two spectral lines of the training sequence. Fold-over is avoided
if we have a normalized roll-off $\alpha \le 1/N$.

(ii) N = even. The Nyquist frequency coincides with a spectral
line of the training sequence. If we choose $\alpha \le 1/N$, the
eigenvalues will still be phase-invariant, but one of them (for
k = N/2) will now be dependent on the timing phase $\tau$.

* Some crossterms in (49) are zero if the bandwidth is less than twice the Nyquist
frequency.

Fig. 9. Spectrum of training sequence.

Most voice-grade telephone channels have very large phase distortion,
but only moderate amplitude distortion. Usually, the worst-case
gain deviations over a given frequency range are known (e.g., on
private channels). If we deal with small excess bandwidth and we
know our sequence $\xi$, it is obviously possible to calculate the spread of
the eigenvalues from (46) or (49). We may then choose the value of $\beta$
in (35) so that

$$ 0 < \beta < \frac{2}{\mu_{\max}} \tag{50} $$

to insure convergence. In addition, it may, of course, be necessary to
normalize the signal power $x^T x$ with an automatic gain control to make
the eigenvalues dependent only on the relative gain difference between
the various frequencies, but independent of the average absolute gain.
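The role of the step-size bound can be illustrated with a small noiseless simulation of the averaged gradient update (35). This is only a sketch under stated assumptions: real-valued made-up samples, the circulant indexing S[i][n] = x[(i - n) mod N], taps started at zero, and hypothetical helper names. The gradient is formed as $S^T (S c - \xi)/N$, which equals $A c - v$ for real data.

```python
import cmath

def circulant(x):
    n = len(x)
    return [[x[(i - k) % n] for k in range(n)] for i in range(n)]

def matvec(a, v):
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]

def mu_max(x):
    # Largest eigenvalue of A, taken as |DFT(x)_k|^2 / N.
    n = len(x)
    return max(
        abs(sum(x[p] * cmath.exp(-2j * cmath.pi * k * p / n) for p in range(n))) ** 2 / n
        for k in range(n)
    )

def train(x, xi, beta, iterations):
    n = len(x)
    S = circulant(x)
    St = [list(col) for col in zip(*S)]
    c = [0.0] * n
    for _ in range(iterations):
        err = [y - d for y, d in zip(matvec(S, c), xi)]   # S c - xi over one period
        grad = [g / n for g in matvec(St, err)]           # A c - v
        c = [ci - beta * gi for ci, gi in zip(c, grad)]
    return c

def residual(x, xi, c):
    return max(abs(y - d) for y, d in zip(matvec(circulant(x), c), xi))

x = [0.1, -0.2, 1.0, 0.3, -0.1]    # received period (made up)
xi = [1.0, -1.0, 1.0, 1.0, -1.0]   # reference period (made up)

limit = 2.0 / mu_max(x)
good = residual(x, xi, train(x, xi, 0.5 * limit, 40))   # inside the bound
bad = residual(x, xi, train(x, xi, 1.1 * limit, 40))    # outside the bound
print(good, bad)
```

With the step size at half the limit the output error collapses within a few iterations; ten percent above the limit it grows from iteration to iteration.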

VI. CHOICE OF THE TRAINING SEQUENCE 

So far, we have not discussed the choice of the training sequence $\xi$.
From the previous study we know that the eigenvalues of S and A
are well-behaved as long as the dft of x, or of the sampled autocorrelation
function, respectively, has no zero elements. This is obviously
sufficient to guarantee the existence of inverses of S and A and therefore
also the existence of a solution $c_0$. Zero elements can be avoided
by selecting a signaling format and a sequence $\xi$ to insure nonzero line
amplitudes at all frequencies $f_n = n/NT$ within the transmission bandwidth.
If the channel does not have serious attenuation gaps, we have
also nonzero amplitudes at the receiver input. To obtain fast convergence,
the eigenvalues should be as equal as possible (minimum
spread). This can obviously best be achieved by selecting a sequence
$\xi$ which produces lines of equal amplitudes; predistortion for expected
attenuation at the band edges is possible. The transmitter is then
effectively sending a comb of equally spaced frequencies of approximately
equal amplitudes that could obviously also be provided by a
number of frequency generators, but are, of course, much more efficiently
synthesized as spectral lines of a suitable sequence. Note that
the samples of the training sequence $\xi$ need not necessarily be binary;
arbitrary numbers (and sequence lengths) can be stored in ROMs in
both transmitter and receiver. We will discuss a few special cases for
$\xi$, assuming small excess bandwidth (< 1/N), an odd number of
taps, and flat gain:

(i) Single pulse, $\xi^T = (0, \cdots, 0, 1, 0, \cdots, 0)$: This produces a
frequency comb of equal amplitudes. We have further

$$ A = \frac{1}{N} I, \qquad \lambda_k = \text{const} = \frac{1}{N}, \qquad \beta_{\text{opt}} = N. \tag{51} $$

Convergence is obtained in a single iteration, independent of
the initial settings. See eq. (40).

(ii) Single pulse, $\xi^T = (1, \cdots, 1, -1, 1, \cdots, 1)$: This produces a
similar frequency comb, but with a much larger amplitude at
dc. The eigenvalues are shown to be

$$ \lambda_0 = \frac{(N-2)^2}{N}; \qquad \lambda_1 = \cdots = \lambda_{N-1} = \frac{4}{N}. \tag{52} $$

(iii) Maximum-length pseudorandom sequence: Such sequences
have lengths $N = 2^m - 1$ (among others), and were used for
the simulations given earlier in Section II. The eigenvalues are

$$ \lambda_0 = \frac{1}{N}; \qquad \lambda_1 = \cdots = \lambda_{N-1} = 1 + \frac{1}{N}. \tag{53} $$

For a given symbol magnitude of the $\xi_i$'s and a given peak power, the
maximum-length sequence gives the largest spectral line energy and
seems thus to be a good choice, especially for noisy channels. Both in
(ii) and in (iii), $\lambda_0$ is different from the other $N - 1$ identical eigenvalues;
we will, however, show that $\beta$ can be selected according to these
$(N - 1)$ values and that $\lambda_0$ does not affect the convergence if the
equalizer is properly preset.
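The three special cases can be reproduced numerically for N = 7. The sketch below assumes a flat, noiseless channel so that the received period equals the training sequence, and it takes the eigenvalues of A as $|\text{DFT}(\xi)_k|^2 / N$; the particular length-7 sequences and the function name `eigenvalues` are illustrative choices, not taken from the paper.

```python
import cmath

def eigenvalues(xi):
    # Eigenvalues of A for a flat channel: |DFT(xi)_k|^2 / N.
    n = len(xi)
    return [
        abs(sum(xi[p] * cmath.exp(-2j * cmath.pi * k * p / n) for p in range(n))) ** 2 / n
        for k in range(n)
    ]

single_pulse = [0, 0, 0, 1, 0, 0, 0]       # case (i)
inverted_pulse = [1, 1, 1, -1, 1, 1, 1]    # case (ii)
m_sequence = [1, 1, 1, -1, 1, -1, -1]      # case (iii), one period of a length-7 m-sequence

for name, seq in [("single pulse", single_pulse),
                  ("inverted pulse", inverted_pulse),
                  ("m-sequence", m_sequence)]:
    print(name, [round(v, 4) for v in eigenvalues(seq)])
```

For N = 7 the printed values match 1/N for case (i), $(N-2)^2/N$ and 4/N for case (ii), and 1/N and 1 + 1/N for case (iii).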

VII. PRESETTING THE TAPS 

Since the reference sequence is not synchronized with the received
signal, the resulting tap vector may have its main tap in any position,
and it would obviously not make sense to preset in the traditional way
of having $c_i = \delta_{i0}$. Instead, we choose an initial tap vector $s u$ whose
coefficients have equal values s. If we assume N = 2M + 1 taps, the
initial equalizer transfer function is given by

$$ H(\omega) = s \sum_{n=-M}^{M} e^{-j\omega nT} = s \, \frac{\sin(N\omega T/2)}{\sin(\omega T/2)}, \tag{54} $$

which is a comb filter with period 1/T and attenuation poles at
f = k/NT, i.e., precisely at the frequencies where the spectral lines
of the training sequence are located. Only dc information is thus
transmitted to the output prior to the first iteration. This is also obvious
from the fact that the output $y = s x^T u$ does not depend on the cyclic
shift of x, since it is the sum of all N sequence samples. If an ideal
Nyquist pulse is applied to such a system, the initial distortion of the
output signal is very large, i.e.,

$$ D_{\text{peak}} = D_{\text{MSE}} = N - 1. \tag{55} $$

This is independent of s. If, however, we look at the average mean-square
error of the training sequence, we have with the initial setting

$$
\begin{aligned}
\varepsilon_1^2 &= \frac{1}{N} \sum_i (s x^T u - \xi_i)^2 \\
&= s^2 (x^T u)^2 - \frac{2s}{N} (x^T u)(\xi^T u) + \frac{1}{N} (\xi^T \xi).
\end{aligned}
\tag{56}
$$

This can be differentiated with respect to s, and we find that the initial
mean-square error is minimized if we choose

$$ s_{\text{opt}} = \frac{\xi^T u}{N x^T u}. \tag{57} $$

The quotient associated with 1/N represents the dc gain of the channel
and is usually close to unity. Since the dc gain of the equalizer is equal
to the sum of the tap coefficients, (57) means that the initial settings
should be chosen to have the same sum as the final settings in $c_0$
(remember that $c_0$ is the inverse dft of the correction function).
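A quick numerical check of the preset value is sketched below. It assumes that u is the all-ones vector, so that $x^T u$ and $\xi^T u$ are plain sums, and it uses made-up sample values; the function names are hypothetical.

```python
def initial_mse(x, xi, s):
    # With all taps equal to s, the output is s * sum(x) for every cyclic shift.
    y = s * sum(x)
    return sum((y - d) ** 2 for d in xi) / len(xi)

def s_opt(x, xi):
    # Preset value from (57): (xi^T u) / (N * x^T u).
    return sum(xi) / (len(x) * sum(x))

x = [0.1, -0.2, 1.0, 0.3, -0.1]    # received period (made up)
xi = [1.0, -1.0, 1.0, 1.0, -1.0]   # reference period (made up)

s_best = s_opt(x, xi)
print(s_best, initial_mse(x, xi, s_best))
print(initial_mse(x, xi, s_best - 0.05), initial_mse(x, xi, s_best + 0.05))
```

Moving s away from the computed value in either direction raises the initial mean-square error, as expected for the minimum of a quadratic in s.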

Some further physical insight is obtained if the mean-square error
after m iterations is studied. This mean-square error may be expressed
as$^{36}$

$$ \varepsilon_{m+1}^2 = \sum_{i=0}^{N-1} q_{i,m+1}, \tag{58} $$

where the ith error component, $q_{i,m+1}$, is given by

$$ q_{i,m+1} = \lambda_i |\hat{\delta}_{i1}|^2 (1 - \beta \lambda_i)^{2m}. \tag{59} $$

The initial value of each component is proportional to its corresponding
eigenvalue and to the square of

$$ \hat{\delta}_{i1} = s u^T \mathbf{r}_i - c_0^T \mathbf{r}_i, \tag{60} $$

where

$$
u^T \mathbf{r}_i =
\begin{cases}
0 & \text{if } i \ne 0 \\
\sqrt{N} & \text{if } i = 0.
\end{cases}
\tag{61}
$$

The values of $\hat{\delta}_{i1}$ are then obtained as

$$ \hat{\delta}_{i1} = -c_0^T \mathbf{r}_i = \{ -\text{DFT}(c_0) \}_i \quad \text{if } i \ne 0, \tag{62} $$

$$ \hat{\delta}_{01} = N^{-1/2} (N s - c_0^T u) \quad \text{if } i = 0. \tag{63} $$

Because of what we have said earlier, we see that these coefficients are
proportional to the values of the correction function at the line frequencies,
except in the case of i = 0, where $\hat{\delta}_{01}$ is only proportional to
the misadjustment at dc. If we select s according to (57), the error
component associated with $\lambda_0$ becomes zero. The constant $\beta$ is then
selected in accordance with the remaining eigenvalues to provide fast
convergence.

A few comments are in order for the case of $\mu_0 = 0$. This will occur
whenever the sequence is dc-free. An example of this property is a
maximum-length sequence that is complemented by one additional
bit to provide an equal number of ones and zeros (N would then be
even). From (59) we see that the error term associated with $\mu_0$ is zero;
since there is no spectral line at dc, the gain at $\omega = 0$ is obviously immaterial
as long as we transmit the training sequence, and convergence
and mean-square error are independent of the choice of s. To see how
this affects the solution $c_0$, we write the relation $A c_0 = v$ in the form

$$ D D^\dagger W^* c_0 = N W^* v. \tag{64} $$

Assume k eigenvalues in the diagonal matrix $D D^\dagger$ are zero.* Therefore,
we have only $N - k$ linearly independent equations for the N components
of $c_0$. The set of solutions for $c_0$ can be expressed with k independent
linear parameters. In the most important case where only
$\mu_0 = 0$, this ambiguity can be avoided easily by constraining the sum
of the tap values. This sum remains constant throughout the equalization
process. We can best show that if we look at the sum of the
gradient components,

$$ u^T g_m = E\{ u^T x_i (y_i - \xi_i) \}, $$

which is obviously zero since $u^T x_i$ is zero by definition. We would thus
choose s in such a way as to match the desired dc gain (which is no
longer immaterial if we transmit random data after the initial training
period). This is still in accordance with (57) if the quotient of the
right-hand side is replaced by the quotient of the spectral densities
at dc when data are random.

* This can happen if the test sequence has zero power at some frequencies
$f_k = k/NT$ within the transmission band.

VIII. INFLUENCE OF NOISE

So far, we have made the assumption that the received samples are
noiseless. We give here a coarse analysis of the effects of noise which
will show that its influence is, in fact, quite small and may often be
neglected. We assume that the taps are calculated from a single-input
signal vector which includes noise; that is, the vector x in (16) now
consists of the received signal values plus noise samples. As the equalizer
cannot make any distinction between signal and noise components, a
tap vector,

$$ c_{0r} = (S + R)^{-1} \xi, \tag{65} $$

will evolve instead of $c_0 = S^{-1} \xi$, where R is a noise matrix defined in
accordance with (17). We then write the tap difference vector as

$$ c_0 - c_{0r} = S^{-1} R c_{0r} \tag{66} $$

if we combine (21) and (65). If a noiseless test sequence were transmitted
over the system, there would be some output error because the
vector $c_{0r}$ is different from the optimum $c_0$. The resulting mean-square
error, averaged over the ensemble of $c_{0r}$'s, would be

$$ \varepsilon^2 = E\{ (c_0 - c_{0r})^\dagger A (c_0 - c_{0r}) \}, \tag{67} $$

and can be written as

$$ \varepsilon^2 = E\{ c_{0r}^\dagger R^\dagger (S^{-1})^\dagger A S^{-1} R c_{0r} \}. \tag{68} $$

If we make use of the relation (37), this can be simplified to

$$ \varepsilon^2 = \frac{1}{N} E\{ c_{0r}^\dagger R^\dagger R c_{0r} \}. \tag{69} $$

Assuming that succeeding noise samples are uncorrelated and that
$|c_{0r}|^2 \approx 1$, we finally obtain for the mean-square error

$$ \varepsilon^2 \approx \sigma^2, $$

where $\sigma^2$ is the noise power. We conclude that for reasonable s/n's
there will be only a small bias introduced because of the superimposed
noise.

IX. CYCLIC EQUALIZATION USING A MEAN-SQUARE ALGORITHM
WITHOUT AVERAGING

The previous discussion has given an analysis of the process of cyclic 
equalization using the mean-square algorithm with averaging. Much 
of the insight developed there regarding the final tap values, the type 
of training sequence to use, etc., carries over to the equalization 
process which uses the mean-square algorithm without averaging. 
However, to be more precise, we now carry out an exact analysis
of this algorithm. Because it permits a simpler implementation,
it is the algorithm without averaging that will most likely be used in
practical situations.

Let the N-component tap-signal vector at time $t_0 + kT$ be denoted
by $x_k$. In the absence of noise, succeeding signal vectors will then be
related by

$$ x_{k+m} = U^m x_k, \tag{70} $$

where U is an N × N cyclic shifting matrix of the form

$$
U = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & & & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
1 & 0 & 0 & \cdots & 0
\end{pmatrix}.
\tag{71}
$$

Note also that U is orthogonal and

$$ U^m = U^{m \pm lN}. \tag{72} $$

The equalizer output at time $t_0 + mT$ will be

$$ y(t_0 + mT) = c_m^T x_m = c_m^T U^m x_0, \tag{73} $$

where we have expressed the signal vector as a cyclic shift of one fixed
state at start-up. We will drop the index on x from now on.

Let $d_k$ be the reference value of the data signal at $t = t_0 + kT$. The
mean-square error at $t_0 + mT$ is

$$ \varepsilon_m^2 = E\{ (c^T U^m x - d_m)^2 \}, \tag{74} $$

where m indicates any of the equally probable cyclic shifts of x and d.
The expected value in (74) can be obtained by time averaging over
$i + 1 \le k \le i + N$ because of the cyclic nature of the signals under
consideration. The gradient with respect to the tap weights is given by

$$ \frac{\partial \varepsilon_m^2}{\partial c} = 2 E\{ e_m U^m x \}. \tag{75} $$


In this section we make adjustments of c at each symbol interval and
use a nonaveraged approximation of (75) (the product of error and
tap signals of the previous baud interval) for updating. Thus, our
strategy becomes*

$$
\begin{aligned}
c_{m+1} &= c_m - \beta U^m x (c_m^T U^m x - d_m) \\
&= U^m (I - \beta x x^T) U^{-m} c_m + \beta U^m x d_m.
\end{aligned}
\tag{76}
$$

For convenience, we define a data vector $\xi$ which contains the reference
values $d_m$. We further define an N-dimensional vector,

$$ r = \{ r_i \}, \qquad r_i = \delta_{ik}, \tag{77} $$

containing zeros in all positions except in one reference (kth) position
("center tap"). We observe that

$$ d_m = r^T U^m \xi, \tag{78} $$

and we can write (76) in the form

$$ c_{m+1} = U^m Z U^{-m} c_m + \beta U^m E U^{-m} r, \tag{79} $$

where we have introduced for convenience

$$ Z = I - \beta x x^T \tag{80} $$

$$ E = x \xi^T. \tag{81} $$

By solving the time-varying difference equation (79), the tap vector
after m adjustments can be expressed as

$$ c_{m+1} = U^m \left\{ Q^m c_1 + \beta \sum_{k=0}^{m-1} Q^k E U^{k-m} r \right\}. \tag{82} $$

The new matrix Q in (82) is defined as

$$ Q = Z U^{-1} = (I - \beta x x^T) U^{-1}, \tag{83} $$

and will play an important role in further analysis. We can also easily
verify the synchronization-invariant properties of (82). In fact, if we
replace x by $U^i x$ (introducing some arbitrary delay), we obtain

$$ c_{m+1} = U^{m+i} \left\{ Q^m U^{-i} c_1 + \beta \sum_{k=0}^{m-1} Q^k E U^{k-m} r \right\}, \tag{84} $$

but since we choose the initial $c_1$ with equal bias values for all coefficients,
$U^{-i} c_1 = c_1$, and (84) and (82) are identical except for an
i-position cyclic shift of the resulting tap vector.
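The synchronization invariance can be seen in a direct simulation of the nonaveraged update. The sketch below makes several assumptions that are not from the paper: a delay-line model in which tap i at time m holds the received sample with index (m - i) mod N, real-valued made-up data, a fixed step size, and hypothetical names (`tap_signals`, `cyclic_lms`, `rotate`). It trains twice, once on a received period and once on a delayed copy, starting from equal taps.

```python
def tap_signals(rx, m):
    # Assumed delay-line contents at time m: tap i holds rx[(m - i) mod N].
    n = len(rx)
    return [rx[(m - i) % n] for i in range(n)]

def cyclic_lms(rx, xi, beta, cycles, preset):
    n = len(rx)
    c = [preset] * n                      # equal initial taps
    for m in range(cycles * n):
        taps = tap_signals(rx, m)
        err = sum(ci * ti for ci, ti in zip(c, taps)) - xi[m % n]
        c = [ci - beta * err * ti for ci, ti in zip(c, taps)]
    return c

def rotate(v, k):
    # result[i] = v[(i + k) mod N]
    n = len(v)
    return [v[(i + k) % n] for i in range(n)]

rx = [0.1, -0.2, 1.0, 0.3, -0.1]    # received period (made up)
xi = [1.0, -1.0, 1.0, 1.0, -1.0]    # reference period (made up)
delay = 2
rx_delayed = [rx[(t - delay) % len(rx)] for t in range(len(rx))]

c_a = cyclic_lms(rx, xi, beta=0.8, cycles=40, preset=0.2)
c_b = cyclic_lms(rx_delayed, xi, beta=0.8, cycles=40, preset=0.2)

# The taps trained on the delayed signal are a cyclic rotation of the others.
shift_gap = max(abs(b - a) for a, b in zip(rotate(c_a, delay), c_b))

# In this noiseless example the trained taps reproduce the reference period.
output_error = max(
    abs(sum(ci * ti for ci, ti in zip(c_a, tap_signals(rx, m))) - xi[m])
    for m in range(len(rx))
)
print(shift_gap, output_error)
```

The delay only rotates the resulting tap vector; the equalized output is unaffected, which is the property exploited when the taps are rotated into position after training.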



* We are assuming $\beta$ is constant; in practice, it might be desirable to make
$\beta$ dependent on m.

X. SOLVING THE DIFFERENCE EQUATION

Before we discuss (82) in more detail, we define

$$ H_m = \beta \sum_{k=0}^{m-1} Q^k E U^k \tag{85} $$

because sums of this type will be frequently needed in the subsequent
analysis. Examples are

$$
\begin{aligned}
H_0 &= 0 \\
H_1 &= \beta E \\
H_N &= \beta \sum_{k=0}^{N-1} Q^k E U^k.
\end{aligned}
$$

Further, it is rather straightforward to show that

$$ H_{i+j} = H_i + Q^i H_j U^i = H_j + Q^j H_i U^j, \tag{86} $$

and, as a special case,

$$ H_{lN+n} = H_{lN} + Q^{lN} H_n = H_n + Q^n H_{lN} U^n. \tag{87} $$

$H_{lN}$ may be expressed in a more convenient form if we introduce a new
summation index, iN + j,

$$ H_{lN} = \beta \sum_{i=0}^{l-1} Q^{iN} \sum_{j=0}^{N-1} Q^j E U^j. \tag{88} $$

The first series can be summed, and we obtain

$$ H_{lN} = (I - Q^{lN})(I - Q^N)^{-1} H_N, \tag{89} $$

where we have made the implicit assumption that $I - Q^N$ is nonsingular
(we will say more about that in a moment).

We are now ready to study (82), which may be written as

$$ c_{m+1} = U^m \{ Q^m c_1 + H_m U^{-m} r \}. \tag{90} $$

By setting m = lN + n and combining the first expression in (87) with
(89), we obtain

$$
\begin{aligned}
c_{lN+n+1} = U^n \{ & Q^{lN+n} c_1 + Q^{lN} H_n U^{-n} r \\
& + (I - Q^{lN})(I - Q^N)^{-1} H_N U^{-n} r \}.
\end{aligned}
\tag{91}
$$

For any nonzero value of $\beta$, $c_m$ will not converge in the usual sense;
however, it will reach a steady-state condition that does not depend
upon the initial value of $c_1$. In order that this can occur, we require

$$ \lim_{l \to \infty} Q^{lN} = 0, \tag{92} $$

which means that we require the spectral radius $\rho(Q)$ to be less than
unity (all eigenvalues inside the unit circle). This will also guarantee
the nonsingularity of $I - Q^N$ and thus the existence of (89). The first
(transient) term in (91) will then converge to zero, which means that
the steady-state solution is independent of the initial tap settings.*
The second term will also converge to zero and the steady-state value
of c is

$$ c_{\infty,n} = U^n (I - Q^N)^{-1} H_N U^{-n} r. \tag{93} $$

This solution is periodic in n; it is trivial to verify that, owing to the
cyclic nature of U,

$$ c_{\infty,n+N} = c_{\infty,n}. \tag{94} $$

As an important special case we have, if n = 0,

$$ c_\infty = (I - Q^N)^{-1} H_N r = H_\infty r. \tag{95} $$

After m = lN iterations, the tap vector is

$$ c_{lN+1} = Q^{lN} c_1 + (I - Q^{lN})(I - Q^N)^{-1} H_N r. \tag{96} $$

By combining (95) and (96), we can express the convergence with the
error vector $c_{lN+1} - c_\infty$,

$$ c_{lN+1} - c_\infty = Q^{lN} (c_1 - c_\infty), \tag{97} $$

as a function of the initial error vector. This is a particularly simple
form, which shows how the convergence is directly dependent on the
eigenvalues of Q. The error vector is reduced by a factor $Q^N$ with each
cycle of iterations. The eigenvalues of Q are functions of $\beta$ and of the
signal format and channel characteristics. We will study this problem
in the next section.

XI. CONDITIONS FOR CONVERGENCE 

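The eigenvalues $\lambda$ and eigenvectors z of Q are determined by

$$ Q z = U^{-1} (I - \beta x x^T) z = \lambda z. \tag{98} $$

We can calculate $(\lambda z)^\dagger (\lambda z)$ and obtain

$$ |\lambda|^2 z^\dagger z = z^\dagger z - 2\beta z^\dagger x x^T z + \beta^2 z^\dagger (x x^T)^2 z. \tag{99} $$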

Assuming normalization of the eigenvectors, we then require for
stability that

$$ |\lambda|^2 = 1 - 2\beta |z^\dagger x|^2 + \beta^2 (x^\dagger x) |z^\dagger x|^2 < 1. \tag{100} $$

* If only a small number of iterations are used for training, $c_1$ should be chosen
carefully, since c will then be a function of the transient term as well.

If we now assume that $z^\dagger x \ne 0$, we get the simple condition

$$ 0 < \beta < \frac{2}{x^\dagger x} \tag{101} $$

to ensure convergence of the tap coefficients. Note that the bound
(101) depends only on the received signal power, which can be normalized
with an automatic gain control.
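The step-size bound can be probed with a small noiseless simulation. This is an illustrative sketch, not an exact reproduction of the analysis: it assumes a delay-line model in which tap i at time m holds sample (m - i) mod N, real-valued made-up data with no zero DFT lines, taps started at zero, and a hypothetical function `worst_error_after_training`.

```python
def worst_error_after_training(rx, xi, beta, cycles):
    n = len(rx)
    c = [0.0] * n
    for m in range(cycles * n):
        taps = [rx[(m - i) % n] for i in range(n)]
        err = sum(ci * ti for ci, ti in zip(c, taps)) - xi[m % n]
        c = [ci - beta * err * ti for ci, ti in zip(c, taps)]   # nonaveraged update
    # Largest output error over one further period with the taps frozen.
    return max(
        abs(sum(c[i] * rx[(m - i) % n] for i in range(n)) - xi[m])
        for m in range(n)
    )

rx = [0.1, -0.2, 1.0, 0.3, -0.1]    # received period (made up)
xi = [1.0, -1.0, 1.0, 1.0, -1.0]    # reference period (made up)
power = sum(v * v for v in rx)      # x^T x

stable = worst_error_after_training(rx, xi, 1.0 / power, 60)     # inside (101)
unstable = worst_error_after_training(rx, xi, 2.2 / power, 60)   # outside (101)
print(stable, unstable)
```

Inside the bound the error dies out; ten percent beyond it the taps grow without limit. Normalizing the received power, as suggested in the text, keeps a fixed step size inside the bound.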

A completely different situation occurs if x is orthogonal to an
eigenvector z: $z^\dagger x = 0$. It is easy to see from (100) that this would
imply $|\lambda| = 1$, regardless of $\beta$. This case must be avoided and needs
some special attention.

We first conclude that $z^\dagger x = 0$ implies that z is an eigenvector of
both Q and U; this is evident from (98). The next step is then to determine
the eigenvectors y and eigenvalues $\mu$ of the cyclic shifting
matrix U. They are defined by the equation

$$ U y = \mu y. \tag{102} $$

We introduce a unitary matrix W with elements

$$ w_{ik} = \frac{1}{\sqrt{N}} \exp\left( -j \frac{2\pi}{N} ik \right), \qquad 0 \le i, k \le N-1 \tag{103} $$

and observe that

$$ \{ W^* U W \}_{ik} = \delta_{ik} \exp\left( -j \frac{2\pi i}{N} \right) \tag{104} $$

is diagonal. The eigenvalues of U are thus given by

$$ \mu_i = \exp\left( -j \frac{2\pi i}{N} \right), \qquad i = 0, 1, \ldots, N-1 \tag{105} $$

and the corresponding eigenvectors are

$$ y_i = (w_{i0}, w_{i1}, w_{i2}, \cdots, w_{i,N-1}). \tag{106} $$

We now define a vector h with values $y_i^T x$,

$$ h = \begin{pmatrix} y_0^T x \\ \vdots \\ y_{N-1}^T x \end{pmatrix} = W x. \tag{107} $$

The components of h are thus simply the components of the discrete
Fourier transform of x, and we can finally write our requirement
$z^\dagger x \ne 0$ in the form

$$ \sqrt{N} h_i = \sum_{k=0}^{N-1} x_k \exp\left( -j \frac{2\pi}{N} ik \right) \ne 0 \quad \text{for } i = 0, \cdots, N-1. \tag{108} $$

This is not a serious restriction, since most channels will produce an
x whose dft will have only nonzero components. Difficulties can arise,
however, with frequency gaps of severe attenuation within the passband
range, but this is a condition that needs special attention with
any equalizer. Partial-response signaling does not satisfy (108), and
we conclude that it cannot be used for cyclic equalization without
changes in the equalizer structure or tap-updating algorithm.

Before we leave the stability discussion, we would like to point out
another aspect of our problem. By setting n = 1 and $lN \to \infty$ in (89),
we obtain

$$ H_\infty - Q H_\infty U = \beta E, $$

or, after postmultiplying with $U^{-1}$,

$$ Q H_\infty - H_\infty U^{-1} = -\beta E U^{-1}. \tag{109} $$

Matrix equations of the above type play an important role in stability
and control theory (Lyapunov), and it is known that a unique solution
of (109) exists only if Q and $U^{-1}$ have no common eigenvalues. This
would obviously also lead to our conditions (101) and (108).

XII. ASYMPTOTIC BEHAVIOR 

The coefficient vector that minimizes the mean-square error of the
received sequence is given by

$$ c_{\text{opt}} = A^{-1} v, \tag{110} $$

where A and v are the signal-correlation matrix and the cross-correlation
vector between this sequence and the reference. Our current
strategy does not use the gradient (75) in a steepest descent algorithm;
nor do we assume that $\beta$ decreases as the iterations proceed. Thus, it
is to be expected that we obtain settings that are biased with respect
to (110). We first write (95) in the form

$$ (I - Q^N) c_\infty = H_N r. \tag{111} $$

From (70) and (83) we can express $Q^N$ as

$$
\begin{aligned}
Q^N &= (I - \beta x_N x_N^T) \cdots (I - \beta x_2 x_2^T)(I - \beta x_1 x_1^T) \\
&= I - \beta \sum_i x_i x_i^T + \beta^2 \sum_{i>k} x_i x_i^T x_k x_k^T - \cdots + (-1)^N \beta^N x_N x_N^T \cdots x_1 x_1^T.
\end{aligned}
\tag{112}
$$

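If the signal matrix,

$$ A = E[x_i x_i^T] = \frac{1}{N} \sum_{i=1}^{N} x_i x_i^T, \tag{113} $$

is introduced, we have

$$ I - Q^N = \beta N A \left\{ I - \frac{\beta}{N} A^{-1} \sum_{i>k} x_i x_i^T x_k x_k^T + \cdots \right\}. \tag{114} $$

We can expand the right-hand side of (111) in a similar way,

$$
\begin{aligned}
H_N r &= \beta \sum_{i=1}^{N} \left[ \prod_{j=i+1}^{N} (I - \beta x_j x_j^T) \right] x_i d_i \\
&= \beta N v - \beta^2 \sum_{i=1}^{N} \sum_{j=i+1}^{N} x_j x_j^T x_i d_i + \cdots,
\end{aligned}
\tag{115}
$$

where the signal correlation vector v is defined as

$$ v = E\{ x_i d_i \} = \frac{1}{N} \sum_{i=1}^{N} x_i d_i. \tag{116} $$

Combining (113) to (116) and writing out only the first-order terms
in $\beta/N$ yield

$$
\begin{aligned}
c_\infty = {} & \left\{ I + \frac{\beta}{N} A^{-1} \sum_{i>k} x_i x_i^T x_k x_k^T - \cdots \right\} c_{\text{opt}} \\
& - \frac{\beta}{N} A^{-1} \sum_{i=1}^{N} \sum_{j=i+1}^{N} x_j x_j^T x_i d_i + \cdots.
\end{aligned}
\tag{117}
$$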

The neglected terms in (117) are multiplied with higher powers of $\beta$.
It is, therefore, always possible to choose $\beta$ small enough to make the
linear term dominant. We can conclude that the resulting asymptotic
tap vector differs from the mmse solution $c_{\text{opt}}$ by a bias which, for
sufficiently small $\beta$, is directly proportional to $\beta$ and may be made
arbitrarily small. Very fast initial convergence can be obtained by
choosing $\beta$ large; then $\beta$ may be made smaller for the remaining iterations
to reduce the bias error (gear shifting).* This will also reduce the
periodic fluctuation of the tap coefficients in the final steady state.

On the other hand, one should always keep in mind that the cyclic
process is used only during the training time and that random data
are used later on for adaptation. The tap vector that yields mmse for
the training sequence generally does not minimize the mean-square
distortion for random data. However, the work of Chang and Ho$^{40}$
indicates that (for small $\beta$) the results may not be significantly in
error. It would be expected that, in the normal data set application,
cyclic equalization would be used for enough cycles to achieve a good
open eye; then a longer training sequence would be used, decision-directed,
to determine the steady-state tap coefficients.

* It seems possible that a continuous decrease of $\beta$ during the iterations would
yield superior results; we have, however, not analyzed this case.

XIII. ACCELERATED PROCESSING

After these theoretical studies, we conclude our paper by discussing
some more practical aspects of the signal-processing organization.
More precisely, we present a somewhat modified implementation
technique of cyclic equalization that will allow a further reduction in 
the initial training time. For this, we assume that the received sequence 
is not substantially corrupted by noise. In a highly dispersive channel 
with a relatively high s/n, this is a realistic assumption since inter- 
symbol interference is completely dominant and noise is of minor in- 
fluence at the beginning of equalization. Once the initial transients 
have settled, the receiver will thus see a train of continuously repeated 
identical sequences. No information is lost if one sequence length is 
stored in the receiver for further processing and the input is switched 
off. Such a system is depicted in Fig. 10. 




Fig. 10. Block diagram of cyclic equalizer equipped for accelerated processing.
(Diagram labels: stored reference sequence; high-speed clock.)

For clarity, only three taps are shown. The samples of the data sequence
are entered into the delay line (shift register in the case of a
digital equalizer) of the transversal filter while switches S are in posi- 
tion A. As soon as one full sequence is stored, i.e., when samples have 
reached the end of the delay line, switches S are moved to position B. 
Thus, a shift-register ring circuit is formed and the stored samples can 
be shifted cyclically by applying appropriate clock pulses. The stored 
reference sequence is shifted at the same speed. It is important to 
realize that this speed need not be related to the actual data rate. The 
stored signal vector and the reference sequence can be shifted at a 
much higher rate, thus simulating a "speeded-up" data flow. Initial 
training can be achieved in a time interval limited only by the speed 
capabilities of the signal-processing hardware. After going through a 
specified number of sequences, training is considered sufficient and the 
computed tap coefficients are cyclically shifted for alignment. All 
switches are then set to position A, received data are shifted down the 
transversal filter at the actual data rate, and further adaptive equaliza- 
tion is performed on a decision-directed basis. The described training 
method combines cyclic rotation of the signal vector, the reference 
vector, and the coefficient vector to simultaneously achieve equaliza- 
tion and synchronization in "speeded-up" time, i.e., virtually instantly. 

The above method is particularly simple when used with a stochastic 
adjustment algorithm. However, accelerated processing is also at- 
tractive with the mean-square gradient-type algorithm. Since the 
gradient is determined by averaging over N symbols, an additional 
array is necessary to store the accumulated correlation products of 
error and tap signals. The speeded-up data flow is again achieved by 
cyclically shifting either the signal vector or the coefficient vector at 
the highest possible rate consistent with the required signal-processing 
operations, only now the coefficient vector remains unchanged until, 
after one full cycle, the correlator array contains the (suitably scaled) 
tap corrections. The coefficients are now updated and the process is 
repeated, if necessary. After a couple of iterations the coefficients are 
rotated to align the largest of them with the reference position and the 
equalizer is switched to real-time processing and decision-directed 
operation. 
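The final alignment step can be sketched in a few lines. The function name `align_taps`, the choice of reference position, and the tap values are hypothetical; the sketch only illustrates rotating the trained coefficients so that the largest one lands on the reference tap.

```python
def align_taps(c, ref_pos):
    # Find the largest-magnitude tap and rotate the vector so it sits at ref_pos.
    n = len(c)
    main = max(range(n), key=lambda i: abs(c[i]))
    shift = main - ref_pos
    return [c[(i + shift) % n] for i in range(n)]

trained = [0.1, -0.05, 0.02, 0.9, 0.2]    # made-up taps with the main tap displaced
aligned = align_taps(trained, ref_pos=2)  # reference ("center") tap at index 2
print(aligned)   # [-0.05, 0.02, 0.9, 0.2, 0.1]
```

Because the rotation is cyclic, no coefficient is lost; the taps that wrap around the end of the vector reappear at the other end.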

Even without accelerated processing, the initial training time using 
cyclic equalization is so short that the delay needed for the signal to 
initially "fill up" the transversal filter becomes significant. With the 
described method of accelerated processing, the training time can be
reduced to an arbitrarily short interval limited only by the speed capabilities
of the circuit elements. The "fill-up" time becomes completely
dominant. In the extreme, cyclic training can be achieved within a
single symbol interval (after the equalizer is filled up).

XIV. CONCLUSION AND SUMMARY 

Cyclic equalization, as presented in this paper, is a new method for 
initial equalizer training. Its main features are : 

(i) A special training sequence where the number of symbols
equals the number of equalizer taps.
(ii) Very fast start-up with provision for further speeded-up operation,
reducing training time theoretically to less than one
symbol interval.
(iii) Ideal reference operation with no synchronization required.
The processes of equalization and synchronization are combined
in a unique way.
(iv) Perfect equalization at a set of equally spaced points in the
frequency domain.
(v) Simple and economical implementation.

Cyclic equalization provides a set of tap coefficients that need to be 
cyclically rotated after initial training. At this time, a coarse equaliza- 
tion is achieved, the eye pattern is open, and the equalizer can switch 
to a decision-directed mode to achieve final tap settings using random 
data. 

We have shown that the periodic training sequence can always be 
exactly equalized, so that all unbiased tap-updating algorithms will 
converge to the same tap settings, namely the inverse DFT of the 
sampled channel correction function. The mean-square gradient algo- 
rithm was analyzed in detail. The channel correlation matrix eigen- 
values that influence the convergence are directly related to the lines 
of the power spectrum of the received sequence. The problem of initial 
coefficient presetting was discussed, and we made some comments con- 
cerning the choice of the training sequence and the influence of noise. 
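
The closed-form tap settings can be illustrated numerically. The sketch below assumes that the samples of the channel correction function are the ratios D_i / X_i of the DFT of the training sequence to the DFT of one received period; the function names, the period-7 sequence, and the channel are hypothetical choices for the example. It computes the taps as the inverse DFT of those ratios and checks that the periodic training sequence is then reproduced exactly.

```python
import cmath


def dft(v, inverse=False):
    """Naive DFT; the inverse includes the 1/N scaling."""
    n = len(v)
    sign = 1.0 if inverse else -1.0
    out = [sum(v[k] * cmath.exp(sign * 2j * cmath.pi * i * k / n) for k in range(n))
           for i in range(n)]
    return [z / n for z in out] if inverse else out


def cyclic_conv(a, b):
    """Circular convolution of two equal-length sequences."""
    n = len(a)
    return [sum(a[m] * b[(k - m) % n] for m in range(n)) for k in range(n)]


def closed_form_taps(x, d):
    """Taps as the inverse DFT of D_i / X_i (assumed correction samples)."""
    X, D = dft(x), dft(d)
    if any(abs(Xi) < 1e-12 for Xi in X):
        raise ValueError("received spectrum has a zero line; no exact solution")
    C = [Di / Xi for Di, Xi in zip(D, X)]
    # x and d are real, so the imaginary parts are rounding noise only.
    return [z.real for z in dft(C, inverse=True)]


# Hypothetical example data.
d = [1, 1, 1, -1, 1, -1, -1]
h = [1.0, 0.4, -0.2, 0.0, 0.0, 0.0, 0.0]
x = cyclic_conv(h, d)
taps = closed_form_taps(x, d)
residual = max(abs(yk - dk) for yk, dk in zip(cyclic_conv(taps, x), d))
```

The guard against a zero spectral line mirrors the condition that the discrete Fourier transform of the received signal vector must have no zero elements.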

The cyclic equalization process using the mean-square algorithm 
without averaging was considered, and the difference equation that 
describes the coefficient convergence was solved. It was proved that 
the algorithm converges provided that the discrete Fourier transform 
of the received signal vector has no zero elements, and that the step 
size is within certain limits related to the number of taps and the 
received signal power. Finally, it was shown that the tap coefficients 
for the algorithm without averaging equal those for the algorithm with 
averaging except for an error term which goes to zero as the step size 
approaches zero. 
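
A per-symbol variant of the update can be sketched in the same way. The code below is an illustrative assumption-laden example, not the analysis given in the paper: the function name, the data, and the step size of 0.5 divided by the received energy per period are chosen for the example, and the setting is noiseless. It adjusts the coefficients after every symbol, without averaging, and checks that the residual error over one period vanishes. Since the exact solution is unique when the received spectrum has no zero lines, these are the same taps the averaged algorithm reaches; the step-size-dependent error term discussed above is not modelled here.

```python
def cyclic_conv(a, b):
    """Circular convolution of two equal-length sequences."""
    n = len(a)
    return [sum(a[m] * b[(k - m) % n] for m in range(n)) for k in range(n)]


def train_per_symbol(x, ref, sweeps, beta):
    """Update the taps after every symbol (no averaging over the period)."""
    n = len(x)
    c = [0.0] * n
    for _ in range(sweeps):
        for k in range(n):
            tap_signals = [x[(k - j) % n] for j in range(n)]
            e = sum(cj * sj for cj, sj in zip(c, tap_signals)) - ref[k]
            c = [cj - beta * e * sj for cj, sj in zip(c, tap_signals)]
    return c


# Hypothetical example data.
d = [1, 1, 1, -1, 1, -1, -1]
h = [1.0, 0.4, -0.2, 0.0, 0.0, 0.0, 0.0]
x = cyclic_conv(h, d)
energy = sum(v * v for v in x)       # N times the received signal power
beta = 0.5 / energy                  # assumed step size, inside 0 < beta < 2 / energy
c = train_per_symbol(x, d, 1000, beta)
residual = max(abs(yk - dk) for yk, dk in zip(cyclic_conv(c, x), d))
```

The step-size range quoted in the comment follows from treating each update as a relaxed projection onto one equation of the periodic system; it is offered as a plausible reading of the limits related to the number of taps and the received signal power, not as the bound derived in the paper.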

The paper concluded by presenting a signal processing
technique that achieves "accelerated convergence." This allows
coefficient calculation in a time interval limited only by the speed
capabilities of the equalizer circuitry.